metric entropy

Does Flatness imply Generalization for Logistic Loss in Univariate Two-Layer ReLU Network?

Qiao, Dan, Wang, Yu-Xiang

arXiv.org Machine Learning

We consider the problem of generalization of arbitrarily overparameterized two-layer ReLU neural networks with univariate input. Recent work showed that under square loss, flat solutions (motivated by flat / stable minima and the Edge of Stability phenomenon) provably cannot overfit, but it remains unclear whether the same phenomenon holds for logistic loss. This is a puzzling open problem because existing work on logistic loss shows that gradient descent with increasing step size converges to interpolating solutions (at infinity, in the margin-separable cases). In this paper, we prove that the \emph{flatness implies generalization} principle is more delicate under logistic loss. On the positive side, we show that flat solutions enjoy near-optimal generalization bounds within a region between the left-most and right-most \emph{uncertain} sets determined by each candidate solution. On the negative side, we show that there exist arbitrarily flat yet overfitting solutions at infinity that are (falsely) certain everywhere, certifying that flatness alone is insufficient for generalization in general. We demonstrate the effects predicted by our theory in a well-controlled simulation study.
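
To make the setting concrete, here is a minimal sketch (not the authors' code) of the objects the abstract discusses: a univariate two-layer ReLU network, its logistic loss on $\pm 1$ labels, and a crude random-perturbation flatness proxy. The width, data, and perturbation scale are illustrative assumptions, and the proxy is not the paper's exact flatness measure.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def forward(params, x):
    # Univariate two-layer ReLU net: f(x) = sum_j a_j * relu(w_j * x + b_j)
    a, w, b = params
    return relu(np.outer(x, w) + b) @ a

def logistic_loss(params, x, y):
    # y in {-1, +1}; mean of log(1 + exp(-y * f(x)))
    margins = y * forward(params, x)
    return np.mean(np.logaddexp(0.0, -margins))

def sharpness_proxy(params, x, y, eps=1e-2, trials=50):
    # Crude flatness measure: average loss increase under random parameter
    # perturbations of scale eps (an assumption, not the paper's definition).
    base = logistic_loss(params, x, y)
    deltas = []
    for _ in range(trials):
        noisy = tuple(p + eps * rng.standard_normal(p.shape) for p in params)
        deltas.append(logistic_loss(noisy, x, y) - base)
    return float(np.mean(deltas))

# Toy univariate data with +-1 labels and an overparameterized width.
x = rng.uniform(-1, 1, size=64)
y = np.sign(x + 0.1 * rng.standard_normal(64))
m = 100
params = (rng.standard_normal(m) / m, rng.standard_normal(m), rng.standard_normal(m))
print(logistic_loss(params, x, y), sharpness_proxy(params, x, y))
```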




Generalization Bounds in Hybrid Quantum-Classical Machine Learning Models

Wu, Tongyan, Bentellis, Amine, Sakhnenko, Alona, Lorenz, Jeanette Miriam

arXiv.org Artificial Intelligence

Hybrid classical-quantum models aim to harness the strengths of both quantum computing and classical machine learning, but their practical potential remains poorly understood. In this work, we develop a unified mathematical framework for analyzing generalization in hybrid models, offering insight into how these systems learn from data. We establish a novel generalization bound of the form $\tilde{\mathcal O}\left( \tfrac{\alpha^{k}}{\sqrt{N}} \big( k^{3/2}\sqrt{mn} + \sqrt{T\log T}\big) \right)$ for $N$ training data points, $T$ trainable quantum gates, $n$-dimensional quantum circuit output, and $k$ bounded linear layers $\|F_i\|_F \leq \alpha$, where $i = 1, \dots, k$ and $F_i \in \mathbb{R}^{m \times n}$, interspersed with activation functions. This generalization bound decomposes into quantum and classical contributions, providing a theoretical framework that separates their influence and clarifies their interaction. Alongside the bound, we highlight conceptual limitations of applying classical statistical learning theory in the hybrid setting and suggest promising directions for future theoretical work.
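
As a quick sanity check on how the stated bound scales, the following sketch evaluates it up to constants and the log factors hidden in the $\tilde{\mathcal O}$; the example parameter values are arbitrary assumptions, not numbers from the paper.

```python
import math

def hybrid_bound(N, T, n, m, k, alpha):
    """Scaling of the stated generalization bound, up to constants and the
    hidden log factors of the O-tilde (illustrative, not constant-exact)."""
    return (alpha ** k / math.sqrt(N)) * (k ** 1.5 * math.sqrt(m * n)
                                          + math.sqrt(T * math.log(T)))

# Example: 10k samples, 200 trainable gates, 8-dim circuit output,
# three 16x8 linear layers with Frobenius norm <= 2 (assumed values).
print(hybrid_bound(N=10_000, T=200, n=8, m=16, k=3, alpha=2.0))
```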


Approximation Rates of Shallow Neural Networks: Barron Spaces, Activation Functions and Optimality Analysis

Lu, Jian, Huang, Xiaohuang

arXiv.org Artificial Intelligence

This paper investigates the approximation properties of shallow neural networks whose activation functions are powers of exponential functions. It focuses on how the approximation rate depends on the dimension and on the smoothness of the target function within the Barron function space. We examine the approximation rates of networks with ReLU$^{k}$ activation functions, proving that the optimal rate cannot be achieved under $\ell^{1}$-bounded coefficients or insufficient smoothness conditions. We also establish optimal approximation rates in various norms for functions in Barron spaces and Sobolev spaces, confirming the curse of dimensionality. Our results clarify the limits of shallow neural networks' approximation capabilities and offer insights into the selection of activation functions and network structures.
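
The objects under study can be illustrated numerically: a shallow network with ReLU$^{k}$ activation, random inner weights, and fitted outer coefficients whose $\ell^{1}$ norm is reported alongside the error, since the abstract ties optimality to $\ell^{1}$-bounded coefficients. This is a toy sketch, not the paper's constructions; the target function, width, and $k$ are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu_k_features(x, w, b, k):
    # ReLU^k activation: sigma_k(z) = max(z, 0)**k, one column per neuron.
    return np.maximum(np.outer(x, w) + b, 0.0) ** k

# Shallow network with random inner weights and least-squares outer weights,
# a common proxy for studying shallow-network approximation.
n_neurons, k = 200, 2
w = rng.standard_normal(n_neurons)
b = rng.standard_normal(n_neurons)
x = np.linspace(-1, 1, 256)
target = np.sin(np.pi * x)  # arbitrary smooth target for illustration

Phi = relu_k_features(x, w, b, k)
coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)
approx = Phi @ coef
print("L2 approximation error:", np.sqrt(np.mean((approx - target) ** 2)))
print("l1 norm of outer coefficients:", np.abs(coef).sum())
```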



Partially Observable Reinforcement Learning with Memory Traces

Eberhard, Onno, Muehlebach, Michael, Vernade, Claire

arXiv.org Artificial Intelligence

Partially observable environments present a considerable computational challenge in reinforcement learning due to the need to consider long histories. Learning with a finite window of observations quickly becomes intractable as the window length grows. In this work, we introduce memory traces. Inspired by eligibility traces, these are compact representations of the history of observations in the form of exponential moving averages. We prove sample complexity bounds for the problem of offline on-policy evaluation that quantify the value errors achieved with memory traces for the class of Lipschitz continuous value estimates. We establish a close connection to the window approach, and demonstrate that, in certain environments, learning with memory traces is significantly more sample efficient. Finally, we underline the effectiveness of memory traces empirically in online reinforcement learning experiments for both value prediction and control.
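
A memory trace as described, an exponential moving average of the observation history, can be sketched in a few lines. The specific recursion $z_t = (1-\lambda) z_{t-1} + \lambda o_t$ and the single decay rate are assumptions for illustration and may differ from the paper's exact parameterization.

```python
import numpy as np

def memory_trace(observations, lam):
    """Exponential moving average of the observation history:
    z_t = (1 - lam) * z_{t-1} + lam * o_t  (assumed recursion form)."""
    z = np.zeros_like(observations[0], dtype=float)
    traces = []
    for o in observations:
        z = (1.0 - lam) * z + lam * np.asarray(o, dtype=float)
        traces.append(z.copy())
    return np.stack(traces)

# A trace summarizes an unbounded history in fixed memory, unlike a
# length-L window whose representation grows with L.
obs = np.eye(4)[[0, 1, 1, 2, 3, 0]]   # one-hot observations
print(memory_trace(obs, lam=0.3))
```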


Robust density estimation over star-shaped density classes

Liu, Xiaolong, Neykov, Matey

arXiv.org Machine Learning

We establish a novel criterion for comparing the performance of two densities, $g_1$ and $g_2$, within the context of corrupted data. Utilizing this criterion, we propose an algorithm to construct a density estimator within a star-shaped density class, $\mathcal{F}$, under conditions of data corruption. We proceed to derive the minimax upper and lower bounds for density estimation across this star-shaped density class, characterized by densities that are uniformly bounded above and below (in the sup norm), in the presence of adversarially corrupted data. Specifically, we assume that a fraction $\epsilon \leq \frac{1}{3}$ of the $N$ observations are arbitrarily corrupted. We obtain the minimax upper bound $\max\{ \tau_{\overline{J}}^2, \epsilon \} \wedge d^2$. Under certain conditions, we obtain the minimax risk, up to proportionality constants, under the squared $L_2$ loss as $$ \max\left\{ \tau^{*2} \wedge d^2, \epsilon \wedge d^2 \right\}, $$ where $\tau^* := \sup\left\{ \tau : N\tau^2 \leq \log \mathcal{M}_{\mathcal{F}}^{\text{loc}}(\tau, c) \right\}$ for a sufficiently large constant $c$. Here, $\mathcal{M}_{\mathcal{F}}^{\text{loc}}(\tau, c)$ denotes the local entropy of the set $\mathcal{F}$, and $d$ is the $L_2$ diameter of $\mathcal{F}$.
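
The fixed-point quantity $\tau^*$ can be computed by bisection once a local-entropy function is supplied, since $N\tau^2$ increases in $\tau$ while the entropy typically decreases. Below is a sketch under that (assumed) monotonicity, with a hypothetical polynomial entropy chosen so the answer is checkable in closed form.

```python
import math

def tau_star(N, log_local_entropy, lo=1e-8, hi=1.0, iters=80):
    """Solve sup{tau : N * tau^2 <= log_local_entropy(tau)} by bisection,
    assuming the entropy decreases in tau so the crossing is unique."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if N * mid ** 2 <= log_local_entropy(mid):
            lo = mid   # condition holds at mid, so tau* >= mid
        else:
            hi = mid
    return lo

# Hypothetical local entropy log M(tau) = tau^(-1/2), chosen only so the
# fixed point is explicit: N tau^2 = tau^(-1/2)  =>  tau* = N^(-2/5).
N = 10_000
print(tau_star(N, lambda t: t ** -0.5), N ** -0.4)
```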


Which Spaces can be Embedded in $L_p$-type Reproducing Kernel Banach Space? A Characterization via Metric Entropy

Lu, Yiping, Lin, Daozhe, Du, Qiang

arXiv.org Machine Learning

In this paper, we establish a novel connection between metric entropy growth and the embeddability of function spaces into reproducing kernel Hilbert/Banach spaces. Metric entropy characterizes the information complexity of function spaces and has implications for their approximability and learnability. Classical results show that embedding a function space into a reproducing kernel Hilbert space (RKHS) implies a bound on its metric entropy growth. Surprisingly, we prove a \textbf{converse}: a bound on the metric entropy growth of a function space allows it to be embedded into an $L_p$-type Reproducing Kernel Banach Space (RKBS). This shows that the $L_p$-type RKBS provides a broad modeling framework for learnable function classes with controlled metric entropies. Our results shed new light on the power and limitations of kernel methods for learning complex function spaces.
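
For reference, the metric entropy invoked throughout is the standard covering-number quantity; it is growth bounds on this logarithm that the converse result converts into an $L_p$-type RKBS embedding: $$ N(\varepsilon, \mathcal{F}, \|\cdot\|) := \min\left\{ n : \exists\, f_1, \dots, f_n \text{ such that } \mathcal{F} \subseteq \bigcup_{i=1}^{n} \left\{ f : \|f - f_i\| \leq \varepsilon \right\} \right\}, \qquad \mathcal{H}(\varepsilon, \mathcal{F}) := \log N(\varepsilon, \mathcal{F}, \|\cdot\|). $$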